<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Database and log file archival</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
    <link rel="up" href="transapp.html" title="Chapter 12.  Berkeley DB Transactional Data Store Applications" />
    <link rel="prev" href="transapp_checkpoint.html" title="Checkpoints" />
    <link rel="next" href="transapp_logfile.html" title="Log file removal" />
  </head>
  <body>
    <div xmlns="" class="navheader">
      <div class="libver">
        <p>Library Version 12.1.6.2</p>
      </div>
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Database and log file
        archival</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="transapp_checkpoint.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 12.  Berkeley DB Transactional Data Store Applications </th>
          <td width="20%" align="right"> <a accesskey="n" href="transapp_logfile.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="transapp_archival"></a>Database and log file
        archival</h2>
          </div>
        </div>
      </div>
      <p> 
        The third component of the administrative infrastructure,
        archival for catastrophic recovery, concerns the
        recoverability of the database in the face of catastrophic
        failure. Recovery after catastrophic failure is intended to
        minimize data loss when physical hardware has been destroyed
        — for example, loss of a disk that contains databases or
        log files. Although the application may still experience data
        loss in this case, it is possible to minimize it.
    </p>
      <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
        <h3 class="title">Note</h3>
        <p>
            Berkeley DB backups (archives) can be recovered using
            machines of differing byte order. That is, a backup taken
            on a big-endian machine can be used to restore a database
            residing on a little-endian machine. 
        </p>
      </div>
      <p> 
        First, you may want to periodically create snapshots (that
        is, backups) of your databases to make it possible to recover
        from catastrophic failure. These snapshots are either a
        standard backup, which creates a consistent picture of the
        databases as of a single instant in time; or an on-line backup
        (also known as a <span class="emphasis"><em>hot</em></span> backup), which
        creates a consistent picture of the databases as of an
        unspecified instant during the period of time when the
        snapshot was made. The advantage of a hot backup is that
        applications may continue to read and write the databases
        while the snapshot is being taken. The disadvantage of a hot
        backup is that more information must be archived, and recovery
        based on a hot backup is to an unspecified time between the
        start of the backup and when the backup is completed.
    </p>
      <p>
        Second, after taking a snapshot, you should periodically
        archive the log files being created in the environment. It is
        often helpful to think of database archival in terms of full
        and incremental filesystem backups. A snapshot is a full
        backup, whereas the periodic archival of the current log files
        is an incremental backup. For example, it might be reasonable
        to take a full snapshot of a database environment weekly or
        monthly, and archive additional log files daily. Using both
        the snapshot and the log files, a catastrophic crash at any
        time can be recovered to the time of the most recent log
        archival; a time long after the original snapshot. 
    </p>
      <p>
        When incremental backups are implemented using this
        procedure, it is important to know that a database copy taken
        prior to a bulk loading event (that is, a transaction started
        with the <a href="../api_reference/C/txnbegin.html#txnbegin_DB_TXN_BULK" class="olink">DB_TXN_BULK</a> flag) can no longer be used as the
        target of an incremental backup. This is true because bulk
        loading omits logging of some record insertions, so these
        insertions cannot be rolled forward by recovery. It is
        recommended that a full backup be scheduled following a bulk
        loading event.
    </p>
      <p> 
        To create a standard backup of your database that can be
        used to recover from catastrophic failure, take the following
        steps:
    </p>
      <div class="orderedlist">
        <ol type="1">
          <li>
            Commit or abort all ongoing transactions. 
        </li>
          <li> 
            Stop writing your databases until the backup has
            completed. Read-only operations are permitted, but no
            write operations and no filesystem operations may be
            performed (for example, the <a href="../api_reference/C/envremove.html" class="olink">DB_ENV-&gt;remove()</a> and <a href="../api_reference/C/dbopen.html" class="olink">DB-&gt;open()</a>
            methods may not be called). 
        </li>
          <li> 
            Force an environment checkpoint (see the
            <a href="../api_reference/C/db_checkpoint.html" class="olink">db_checkpoint</a> utility for more information). 
        </li>
          <li>
            <p> 
                Run the <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility with option <span class="bold"><strong>-s</strong></span> to identify all the
                database data files, and copy them to a backup device
                such as CD-ROM, alternate disk, or tape.
            </p>
            <p>
                If the database files are stored in a separate
                directory from the other Berkeley DB files, it may be
                simpler to archive the directory itself instead of the
                individual files (see <a href="../api_reference/C/envadd_data_dir.html" class="olink">DB_ENV-&gt;add_data_dir()</a> for additional
                information on storing database files in separate
                directories). 
            </p>
            <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
              <h3 class="title">Note</h3>
              <p>
                    If any of the database files did not have an
                    open <a href="../api_reference/C/db.html" class="olink">DB</a> handle during the lifetime of the
                    current log files, the <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility will not list
                    them in its output. This is another reason it may
                    be simpler to use a separate database file
                    directory and archive the entire directory instead
                    of archiving only the files listed by the
                    <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility.
                </p>
            </div>
          </li>
          <li>
            Run the <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility with option <span class="bold"><strong>-l</strong></span> to identify all the log
            files, and copy the last one (that is, the one with the
            highest number) to a backup device such as CD-ROM,
            alternate disk, or tape. 
        </li>
        </ol>
      </div>
      <p>
        To create a <span class="emphasis"><em>hot</em></span> backup of your
        database that can be used to recover from catastrophic
        failure, take the following steps:
    </p>
      <div class="orderedlist">
        <ol type="1">
          <li>
            Set the <a href="../api_reference/C/envset_flags.html#set_flags_DB_HOTBACKUP_IN_PROGRESS" class="olink">DB_HOTBACKUP_IN_PROGRESS</a> flag in the
            environment. This affects the behavior of transactions
            started with the <a href="../api_reference/C/txnbegin.html#txnbegin_DB_TXN_BULK" class="olink">DB_TXN_BULK</a> flag.
        </li>
          <li>
            <p>
                Archive your databases, as described in the
                previous step #4. You do not have to halt ongoing
                transactions or force a checkpoint. As this is a hot
                backup, and the databases may be modified during the
                copy, it is critical that database pages be read
                atomically as described by <a class="xref" href="transapp_reclimit.html" title="Berkeley DB recoverability">Berkeley DB recoverability</a>. </p>
            <p>
                Note that only UNIX based systems are known to
                support the atomicity of reads. These systems include:
                Solaris, Mac OSX, HPUX and various BSD based systems.
                Linux and Windows based systems do not support atomic
                filesystem reads directly. The XFS file system
                supports atomic reads despite the lack of it in Linux.
                On systems that do not support atomic file system
                reads, the <a href="../api_reference/C/db_hotbackup.html" class="olink">db_hotbackup</a> utility should be used or a tool can
                be constructed using the <a href="../api_reference/C/envbackup.html" class="olink">DB_ENV-&gt;backup()</a> method.
                Alternatively, you can construct a tool using the the
                <a href="../api_reference/C/db_copy.html" class="olink">db_copy()</a> method. You can also perform a hot backup of
                just a single database in your environment using the
                <a href="../api_reference/C/envdbbackup.html" class="olink">DB_ENV-&gt;dbbackup()</a> method.
            </p>
          </li>
          <li>
            <p> 
                Archive <span class="bold"><strong>all</strong></span> of the
                log files. The order of these two operations is
                required, and the database files must be archived
                <span class="bold"><strong>before</strong></span> the log
                files. This means that if the database files and log
                files are in the same directory, you cannot simply
                archive the directory; you must make sure that the
                correct order of archival is maintained.
            </p>
            <p>
                To archive your log files, run the <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility
                using the <span class="bold"><strong>-l</strong></span> option
                to identify all the database log files, and copy them
                to your backup media. If the database log files are
                stored in a separate directory from the other database
                files, it may be simpler to archive the directory
                itself instead of the individual files (see the
                <a href="../api_reference/C/envset_lg_dir.html" class="olink">DB_ENV-&gt;set_lg_dir()</a> method for more information).
            </p>
          </li>
          <li> Reset the <a href="../api_reference/C/envset_flags.html#set_flags_DB_HOTBACKUP_IN_PROGRESS" class="olink">DB_HOTBACKUP_IN_PROGRESS</a> flag.
        </li>
        </ol>
      </div>
      <p>
        To minimize the archival space needed for log files when
        doing a hot backup, run db_archive to identify those log files
        which are not in use. Log files which are not in use do not
        need to be included when creating a hot backup, and you can
        discard them or move them aside for use with previous backups
        (whichever is appropriate), before beginning the hot backup. 
    </p>
      <p> 
        After completing one of these two sets of steps, the
        database environment can be recovered from catastrophic
        failure (see <a class="xref" href="transapp_recovery.html" title="Recovery procedures">Recovery procedures</a> for more information).
    </p>
      <p>
        To update either a hot or cold backup so that recovery from
        catastrophic failure is possible to a new point in time,
        repeat step #2 under the hot backup instructions and archive
        <span class="bold"><strong>all</strong></span> of the log files in
        the database environment. Each time both the database and log
        files are copied to backup media, you may discard all previous
        database snapshots and saved log files. Archiving additional
        log files does not allow you to discard either previous
        database snapshots or log files. Generally, updating a backup
        must be integrated with the application's log file removal
        procedures.
    </p>
      <p> 
        The time to restore from catastrophic failure is a function
        of the number of log records that have been written since the
        snapshot was originally created. Perhaps more importantly, the
        more separate pieces of backup media you use, the more likely
        it is that you will have a problem reading from one of them.
        For these reasons, it is often best to make snapshots on a
        regular basis.
    </p>
      <p>
        <span class="bold"><strong>Obviously, the reliability of your
        archive media will affect the safety of your data. For
        archival safety, ensure that you have multiple copies of
        your database backups, verify that your archival media is
        error-free and readable, and that copies of your backups
        are stored offsite!</strong></span>
    </p>
      <p> 
        The functionality provided by the <a href="../api_reference/C/db_archive.html" class="olink">db_archive</a> utility is also
        available directly from the Berkeley DB library. The following
        code fragment prints out a list of log and database files that
        need to be archived:
    </p>
      <pre class="programlisting">void
log_archlist(DB_ENV *dbenv)
{
    int ret;
    char **begin, **list;

    /* Get the list of database files. */
    if ((ret = dbenv-&gt;log_archive(dbenv,
        &amp;list, DB_ARCH_ABS | DB_ARCH_DATA)) != 0) {
        dbenv-&gt;err(dbenv, ret, "DB_ENV-&gt;log_archive: DB_ARCH_DATA");
        exit (1);
    }
    if (list != NULL) {
        for (begin = list; *list != NULL; ++list)
            printf("database file: %s\n", *list);
        free (begin);
    }

    /* Get the list of log files. */
    if ((ret = dbenv-&gt;log_archive(dbenv,
        &amp;list, DB_ARCH_ABS | DB_ARCH_LOG)) != 0) {
        dbenv-&gt;err(dbenv, ret, "DB_ENV-&gt;log_archive: DB_ARCH_LOG");
        exit (1);
    }
    if (list != NULL) {
        for (begin = list; *list != NULL; ++list)
            printf("log file: %s\n", *list);
        free (begin);
    }
}</pre>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="transapp_checkpoint.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="transapp.html">Up</a>
          </td>
          <td width="40%" align="right"> <a accesskey="n" href="transapp_logfile.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Checkpoints </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> Log file removal</td>
        </tr>
      </table>
    </div>
  </body>
</html>
